Selective eye-gaze augmentation to enhance imitation learning in Atari games

Authors

Abstract

This paper presents the selective use of eye-gaze information in learning human actions in Atari games. Extensive evidence suggests that our eye movements convey a wealth of information about the direction of our attention and our mental states, and encode the information necessary to complete a task. Based on this evidence, we hypothesize that the selective use of eye-gaze, as a clue for attention direction, will enhance learning from demonstration. For this purpose, we propose a selective eye-gaze augmentation (SEA) network that learns when to use the eye-gaze information. The proposed architecture consists of three sub-networks: a gaze prediction network, a gating network, and an action prediction network. Using the prior 4 game frames, a gaze map is predicted by the gaze prediction network and used for augmenting the input frame. The gating network determines whether the predicted gaze map should be fed to the final network to predict the action at the current frame. To validate the approach, we use the publicly available Atari Human Eye-Tracking And Demonstration (Atari-HEAD) dataset of 20 games with 28 million human demonstrations and 328 million eye-gazes (over game frames) collected from four subjects. We demonstrate the efficacy of the approach compared to the state-of-the-art Attention Guided Imitation Learning (AGIL) and Behavior Cloning (BC). The results indicate that the proposed approach (the SEA network) performs significantly better than AGIL and BC. Moreover, to demonstrate the significance of selective gaze use, we compare our approach with random selection of the gaze. Even in this case, the SEA network performs better, validating the advantage of selectively using eye-gaze in demonstration learning.
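As a reading aid, the following is a minimal PyTorch sketch of the three sub-network structure described in the abstract (gaze prediction, gating, action prediction). The layer sizes, the 84x84 input resolution, the soft sigmoid gate, and the multiplicative form of the augmentation are illustrative assumptions, not the authors' implementation.

    import torch
    import torch.nn as nn

    class SEANetwork(nn.Module):
        """Illustrative three-part structure: gaze prediction, gating, action prediction.
        Layer sizes and the soft gating are assumptions, not the published architecture."""

        def __init__(self, num_actions, frame_stack=4):
            super().__init__()
            # Gaze prediction: maps the 4 stacked frames to a spatial gaze map.
            self.gaze_net = nn.Sequential(
                nn.Conv2d(frame_stack, 32, kernel_size=7, padding=3), nn.ReLU(),
                nn.Conv2d(32, 32, kernel_size=5, padding=2), nn.ReLU(),
                nn.Conv2d(32, 1, kernel_size=3, padding=1), nn.Sigmoid(),
            )
            # Gating: a scalar in [0, 1] deciding whether the gaze map should be used.
            self.gate_net = nn.Sequential(
                nn.Conv2d(frame_stack, 16, kernel_size=8, stride=4), nn.ReLU(),
                nn.Flatten(), nn.Linear(16 * 20 * 20, 1), nn.Sigmoid(),  # assumes 84x84 input
            )
            # Action prediction from the (possibly) gaze-augmented current frame.
            self.action_net = nn.Sequential(
                nn.Conv2d(1, 32, kernel_size=8, stride=4), nn.ReLU(),
                nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
                nn.Flatten(), nn.Linear(64 * 9 * 9, 512), nn.ReLU(),
                nn.Linear(512, num_actions),
            )

        def forward(self, frames):                         # frames: (B, 4, 84, 84)
            gaze_map = self.gaze_net(frames)               # (B, 1, 84, 84)
            gate = self.gate_net(frames).view(-1, 1, 1, 1)
            current = frames[:, -1:, :, :]                 # most recent frame
            augmented = current * (1.0 + gate * gaze_map)  # gated gaze augmentation
            return self.action_net(augmented)              # action logits

    # Example: 18 Atari actions, a batch of two 4-frame stacks.
    logits = SEANetwork(num_actions=18)(torch.rand(2, 4, 84, 84))

The abstract describes the gate as deciding whether the gaze map is used at all; the continuous sigmoid gate above is only a differentiable stand-in for that discrete decision.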

Similar articles

Learning to Play Atari Games

Teaching computers to play video games is a complex learning problem that has recently seen increased attention. In this paper, we develop a system that, using constant model and hyperparameter settings, learns to play a variety of Atari games. In order to accomplish this task, we extract object features from the game screen, and provide these features as input into reinforcement learning algor...

Atari Games and Intel Processors

The asynchronous nature of state-of-the-art reinforcement learning algorithms, such as the Asynchronous Advantage Actor-Critic algorithm, makes them exceptionally suitable for CPU computations. However, given the fact that deep reinforcement learning often deals with interpreting visual information, a large part of the training and inference time is spent performing convolutions. In this work we...

The Impact of Determinism on Learning Atari 2600 Games

Pseudo-random number generation on the Atari 2600 was commonly accomplished using a Linear Feedback Shift Register (LFSR). One drawback was that the initial seed for the LFSR had to be hard-coded into the ROM. To overcome this constraint, programmers sampled from the LFSR once per frame, including title and end screens. Since a human player will have some random amount of delay between seeing t...
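For readers unfamiliar with the mechanism, here is a small Python sketch of a Fibonacci LFSR stepped once per frame; the 8-bit width, tap positions, and seed value are illustrative and do not correspond to any particular Atari 2600 title.

    def lfsr_step(state, taps=(0, 2, 3, 4), width=8):
        """One Fibonacci LFSR step: XOR the tapped bits, shift right, feed back into the top bit."""
        feedback = 0
        for t in taps:
            feedback ^= (state >> t) & 1
        return ((state >> 1) | (feedback << (width - 1))) & ((1 << width) - 1)

    # The seed is fixed in ROM, but sampling once per rendered frame (title and
    # end screens included) means the player's variable reaction time determines
    # how many steps have elapsed before a value is actually consumed.
    state = 0x5A  # hypothetical hard-coded seed
    for frame in range(5):
        state = lfsr_step(state)
        print(f"frame {frame}: state = {state:#04x}")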

Distributed Deep Reinforcement Learning: Learn how to play Atari games in 21 minutes

We present a study in Distributed Deep Reinforcement Learning (DDRL) focused on scalability of a state-of-the-art Deep Reinforcement Learning algorithm known as Batch Asynchronous Advantage Actor-Critic (BA3C). We show that using the Adam optimization algorithm with a batch size of up to 2048 is a viable choice for carrying out large scale machine learning computations. This, combined with caref...

Deep Learning for Reward Design to Improve Monte Carlo Tree Search in ATARI Games

Monte Carlo Tree Search (MCTS) methods have proven powerful in planning for sequential decision-making problems such as Go and video games, but their performance can be poor when the planning depth and sampling trajectories are limited or when the rewards are sparse. We present an adaptation of PGRD (policy-gradient for reward design) for learning a reward-bonus function to improve UCT (an MCTS a...
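As a rough illustration of the idea, in generic notation rather than that paper's exact formulation: a learned bonus B_theta augments the environment reward used during planning rollouts, while node selection follows the standard UCT rule:

    \tilde{r}(s,a) = r(s,a) + B_{\theta}(s,a), \qquad
    a^{*} = \arg\max_{a}\left[ \hat{Q}(s,a) + c\,\sqrt{\frac{\ln N(s)}{N(s,a)}} \right]

where \hat{Q}(s,a) is estimated from rollouts that accumulate \tilde{r}, N(s) and N(s,a) are visit counts, and c is the exploration constant.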


Journal

Journal title: Neural Computing and Applications

Year: 2021

ISSN: 0941-0643, 1433-3058

DOI: https://doi.org/10.1007/s00521-021-06367-y